Abstract. A dynamic search window adjustment for the block-matching
algorithm (BMA) based on block similarity is presented to reduce
the computational complexity of full search BMA. The adjustment of
the size of the search window is performed in three steps: (1) set a
new search origin based on the block similarity and the displaced
block difference (DBD), (2) adjust the size of the search window in
inverse proportion to the block similarity, and (3) update the
thresholds to adapt to a given image sequence. The technique can be
easily applied to full search BMA and several fast search algorithms
to improve efficiency and to reduce the possibility of falling into a
local minimum. Experimental results show that the proposed technique
has good MSE performance and reduces the number of search points
substantially. © 1998 SPIE and IS&T.
[S1017-9909(98)02303-4]



1 Introduction 

Motion compensated prediction plays a very important role
in the efficient coding of video sequences. MPEG-1, 2,¹
ITU-T H.261,² and H.263³ adopted the method to reduce the
temporal redundancies that reside in successive frames.
Most of the algorithms developed for motion estimation so
far use a block-based technique called the block-matching
algorithm (BMA), which estimates the motion vector (MV)
block by block. In BMA, a frame is divided into nonoverlapping
blocks of (N×N) pixels. A block of pixels
(called a current block) in the current frame is compared
with its corresponding blocks (called candidate blocks)
within a search area of size (N + 2w) × (N + 2w) in the
reference frame, where w is the maximum displacement of
the MV. The MV of the current block is obtained when the
best matched candidate block is found. The general approach
for BMA is to use the full search block-matching algorithm
(FSBMA). In FSBMA, all (2w + 1)² possible candidate
blocks are compared to obtain the best matched
block.⁴
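
The full search described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: frames are plain lists of rows of luminance values, the matching criterion is the mean absolute error, candidates that would leave the reference frame are skipped, and the names `mae` and `full_search` are hypothetical rather than taken from the paper.

```python
def mae(cur, ref, bx, by, dx, dy, n):
    """Mean absolute error between the current block at (bx, by) and the
    candidate block displaced by (dx, dy) in the reference frame."""
    total = 0
    for j in range(n):
        for i in range(n):
            total += abs(cur[by + j][bx + i] - ref[by + dy + j][bx + dx + i])
    return total / (n * n)


def full_search(cur, ref, bx, by, n, w):
    """Return (best_mv, best_cost, points_checked) for one block.

    Candidates falling outside the reference frame are skipped (an
    assumption), so interior blocks test (2w + 1)**2 positions and
    border blocks test fewer.
    """
    height, width = len(ref), len(ref[0])
    best_mv, best_cost, checked = (0, 0), float("inf"), 0
    for dy in range(-w, w + 1):
        for dx in range(-w, w + 1):
            x, y = bx + dx, by + dy
            if x < 0 or y < 0 or x + n > width or y + n > height:
                continue
            cost = mae(cur, ref, bx, by, dx, dy, n)
            checked += 1
            if cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost, checked


# Toy frames: the current frame is the reference shifted by (dx, dy) = (2, 1).
ref = [[(7 * x + 13 * y + x * y) % 251 for x in range(12)] for y in range(12)]
cur = [[ref[min(y + 1, 11)][min(x + 2, 11)] for x in range(12)] for y in range(12)]
mv, cost, checked = full_search(cur, ref, bx=4, by=4, n=4, w=2)
print(mv, cost, checked)
```

For the interior block of the toy example, all (2·2 + 1)² = 25 candidates are tested, which is the cost that the window adjustment in this paper aims to reduce.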
Because of the intensive computation needed in FSBMA to get the MV,
i.e., the displacement of the block that has the smallest distortion
function of the matching criterion, many fast search
algorithms such as the three step search (TSS),⁵ new three step
search (NTSS),⁶ 2D logarithm search,⁷ one at a time search
algorithm (OTS),⁸ 1D full search,⁹ cross search algorithm
(CSA),¹⁰ and parallel hierarchical 1D search (PHODS)¹¹
have been investigated. But there is a critical problem
with these techniques: falling into a local minimum, owing
to the assumption that the distortion increases monotonically
as the searched point moves away from the position of
minimum distortion.

Another approach to reduce the computational complexity
is an adjustment of the size of the search window w,
which has been suggested in Refs. 12 and 13. In Ref. 12, a
method to reduce the size of the search window in TSS
according to the magnitude of the DBD was presented. In the
approach proposed in Ref. 13, the size of the search window
is determined in proportion to the DBD of blocks. The
method, called an adaptive adjustment of search window
(AASW) for BMA, exploits the motion correlation of spatially
neighboring blocks to determine the search origin and
adjusts the search range according to the different motion
content of the block. The scheme is performed in three
stages: (1) set a search origin, (2) determine the size of
the search window, and (3) update the thresholds for
classification of the motion content of blocks frame by frame. The
search origin is determined from the motion vectors of the
adjacent blocks in the left, upper-left, and upper directions,
together with zero displacement, as candidate predictions of the
motion vector of the current block; the vector that has the minimum
DBD is then selected as the predicted motion vector. The location
pointed to by the predicted motion vector is used as the
search origin for FSBMA. After the search origin is set,
the size of the search area is determined by considering
the DBD at the position of the new search origin. The DBD
is used as an indicator of the degree of motion for a given
block. It was proposed that the DBD at the starting
search point be used as a criterion to identify the motion class
of the block. Three motion classes (low motion, medium motion,
and high motion) are defined according to the magnitude
of the DBD, and their maximum displacements are set to
w/4, w/2, and w, respectively.

Fig. 1 The distribution of the DBD and the mean DBD along the x, y components of MVs: (a) Miss America and (b) Carphone.

The search window adjustment proposed by Feng
et al.¹³ has some problems. First, they assume that the
magnitude of the MV is proportional to the magnitude of the DBD.
That is, if the best matched candidate block in the reference
frame is far away from the search origin, then the DBD
becomes very large. But, with real test images, we can
see that there is no significant correlation between them
(see Fig. 1). In Fig. 1, the DBD distribution and the mean
DBD value obtained by FSBMA with two video sequences
are plotted along the displacement in the horizontal and
vertical directions, respectively. At each location of the MV, the
mean DBD values are similar regardless of the magnitude
of the MV component. For example, the mean DBDs
at locations (−15,−15), (−15,15), (15,−15), and (15,15)
are similar to the mean DBD at location (0,0). In general,
since the DBD is very large in the area of a motion
boundary, the DBD depends on the content of the
block rather than the magnitude of the MV. That is, the
DBD is not significantly affected by the displacement of
the MV. On the other hand, adjacent blocks that belong to
the same moving region have very similar motion fields. So
the spatial correlation of adjacent blocks can be used to
determine the size of the search area and reduce the
computational complexity.

In addition, the video sequences for low bit-rate
applications such as video phone or video conferencing are
gentle, smooth, and vary slowly. Thus, the MV distribution
of the best-matched block is center biased. Figure 1 shows
two facts: (1) the MV distribution is center biased and (2)
the DBD does not correlate highly with the magnitude of the
MV. If the spatial correlation between adjacent blocks is
used well, the mismatch between the DBD and the magnitude of the
MV is remedied, and a further reduction of computational
complexity can be achieved by predicting the initial MV
from the motion displacement of an adjacent block.

In this paper, we propose a new BMA with dynamic 
adjustment of search window (DASW) to overcome the 
complexity of FSBMA. In the DASW algorithm, the new 
search origin for current block and the size of search area 
for MV are determined by considering block similarities, 
MV correlations of adjacent blocks, and their DBDs. Then, 
the general mechanism of FSBMA or several other fast 
search algorithms can be applied within the determined dis- 
placement. 

This paper is organized as follows. In Sec. 2, we describe
the detailed algorithm of the DASW to determine the
size of the search window for each block. In Sec. 3,
experimental results comparing the performance of the methods
for search window adjustment are presented using several
test video sequences for low bit-rate video applications
such as video phone or video conferencing.

2 Dynamic Adjustment of the Search Window 
for BMA 

In video sequences, especially for low bit-rate applications
such as video phone or video conferencing, the motion
field is smooth and changes slowly frame by frame.
The correlation between MVs for adjacent blocks is very
high if each block belongs to the same object because an
object spans several blocks. Also, as shown in Fig. 1, there
are many blocks whose MVs are near the search
origin in video sequences.

Algorithm: Dynamic Adjustment of Search Window (DASW)

Step 1: Segment the current frame using a simple split algorithm;

Step 2: For a given current block, compute block similarities between the block and the upper, left, upper-left,
and upper-right blocks. Then, mark the block that has the largest similarity with the current block. The block
similarity is calculated by using the segmentation information of the current frame as the ratio of the
number of pixels in two blocks that belong to the same region to the number of pixels in a block;

Step 3: Calculate the DBD of the current block at the location pointed to by the MV of the adjacent block which
has the largest block similarity (called DBD_adj) and at the zero displaced location (called DBD_(0,0));

Step 4: Set a new search origin as either the position pointed to by the MV of the adjacent block or the zero displaced
position, whichever has the smaller DBD of the two, DBD_adj and DBD_(0,0);

Step 5: Determine the search window size as follows:

Step 5.1: If the new search origin is set at the location pointed to by the MV of the adjacent block (DBD_(0,0) >
DBD_adj), the size of the search window is adjusted by considering the block similarity, i.e., inversely
proportional to the block similarity;

Step 5.2: If the new search origin is set as the zero displaced position (DBD_(0,0) < DBD_adj), the size of the
search window is determined to be (block similarity) × Max{x, y | MV component of the adjacent
block} + w × (1 − (block similarity)), where w is an initial displacement;

Step 6: Perform BMA at the new search origin with the adjusted displacement;

Step 7: Update the threshold T_seg for segmentation by considering the number of matching blocks frame by
frame.

Fig. 2 The DASW algorithm for determination of the size of a search window based on the block
similarity and DBDs.

Based on these facts, we present a new BMA with 
DASW, which exploits motion structures of objects to re- 
duce the number of matching blocks (candidate blocks). 
The DASW algorithm is described briefly in Fig. 2. Be- 
cause each motion displacement of the block is greatly re- 
lated to the moving objects in successive video frames, if 
some blocks belong to the same object region, they have 
similar motion displacements and DBDs. In the DASW, we 
take advantage of motion structures of objects to determine 
the size of the search window by considering the block 
similarity which is computed using segmentation informa- 
tion of a given frame. 

2.1 Set New Search Origin 

In our approach, we consider the block similarity of spatially
adjacent blocks to determine the size of the search
window. In video sequences, if some adjacent blocks are
contained in the same object, the motion structures of the
blocks are very similar. Therefore, we can predict an initial
MV as the MV of an adjacent block that is significantly
related to the current block and refine the MV at that position
with a smaller displacement. To determine the block
similarity of adjacent blocks, we use the segmentation
information of the given current frame. In general, segmented
regions are expected to have homogeneous characteristics,
such as intensity and texture, that are different in each region.
These characteristics form the feature vectors that are
used to discriminate one region from another. The features
are employed during the segmentation procedure in
checking region homogeneity. Many techniques for image
segmentation are summarized in the literature.¹⁷ In the
DASW, we apply a simple method using a region splitting
algorithm for segmentation. The region splitting algorithm
is a top-down approach that starts with the assumption
that the entire image is homogeneous. If this is not
true, the image is split into four subimages. This splitting
procedure is repeated recursively until homogeneous image
regions are encountered. Homogeneity is checked by testing
whether the pixel difference in a region is greater than a
given threshold T_seg.
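
The splitting procedure can be illustrated with a short Python sketch. It is not the authors' implementation: the text does not define "pixel difference", so the sketch assumes it is the difference between the largest and smallest pixel value in a region, and the function name `split_regions` and the `min_size` stopping parameter are hypothetical.

```python
def split_regions(img, t_seg, min_size=1):
    """Top-down region splitting (a sketch).

    A rectangle is kept as one region when max(pixel) - min(pixel) <= t_seg
    (assumed homogeneity test); otherwise it is split into four subimages.
    Returns a label map with one integer label per final rectangle.
    """
    height, width = len(img), len(img[0])
    labels = [[-1] * width for _ in range(height)]
    next_label = 0
    stack = [(0, 0, width, height)]  # (x, y, region width, region height)
    while stack:
        x, y, rw, rh = stack.pop()
        pixels = [img[j][i] for j in range(y, y + rh) for i in range(x, x + rw)]
        homogeneous = max(pixels) - min(pixels) <= t_seg
        if homogeneous or rw <= min_size or rh <= min_size:
            for j in range(y, y + rh):
                for i in range(x, x + rw):
                    labels[j][i] = next_label
            next_label += 1
        else:
            hw, hh = rw // 2, rh // 2
            stack.extend([
                (x, y, hw, hh),
                (x + hw, y, rw - hw, hh),
                (x, y + hh, hw, rh - hh),
                (x + hw, y + hh, rw - hw, rh - hh),
            ])
    return labels


# A 4 x 4 image with a dark left half and a bright right half.
image = [[0, 0, 100, 100] for _ in range(4)]
for row in split_regions(image, t_seg=10):
    print(row)
```

A larger threshold yields fewer, larger regions, which is why T_seg influences both the block similarities and, through them, the number of search points.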

In the DASW scheme, we use the segmented frame information
together with the MVs of adjacent blocks and their
DBDs for setting the new search origin. First, a new search
origin is selected between the block with zero displacement
and the block displaced by the MV of a neighboring block, by
considering the block similarity. The block similarity is calculated
using the segmentation information of the whole current
frame as the number of pixels in two blocks that belong to the same
region divided by the number of pixels in a block. The new search
origin is set as either the zero displacement or the displacement
given by the MV of the adjacent block with the maximum
block similarity. Of the two candidates, the displacement
of the block that has the smaller DBD is taken as the new
search origin. Let DBD_(0,0) be the DBD of the zero displaced
block and DBD_adj be the DBD of the block displaced by the MV
of the adjacent block that has maximum block similarity to the
current block. If DBD_(0,0) is
smaller than DBD_adj, the zero displacement is selected as
the new search origin; otherwise, the location pointed to by
the MV of the adjacent block with the maximum block
similarity is used as the new search origin for BMA.
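
The selection of the most similar neighbour and of the new search origin might be sketched as follows. The definition of block similarity in the text leaves room for interpretation; this sketch assumes it is the fraction of pixels in the current block whose region label also occurs in the neighbouring block. The function names are hypothetical, and the tie case (equal DBDs) follows the rule stated in this section.

```python
def block_similarity(labels, bx, by, nx, ny, n):
    """Fraction of current-block pixels whose region label also appears in
    the neighbouring block at (nx, ny). This reading is an assumption."""
    neighbour_regions = {labels[ny + j][nx + i] for j in range(n) for i in range(n)}
    shared = sum(
        1
        for j in range(n)
        for i in range(n)
        if labels[by + j][bx + i] in neighbour_regions
    )
    return shared / (n * n)


def most_similar_neighbour(labels, bx, by, n):
    """Check the upper, left, upper-left, and upper-right blocks and return
    ((nx, ny), similarity) for the best one, or None if none is in the frame."""
    height, width = len(labels), len(labels[0])
    candidates = [(bx, by - n), (bx - n, by), (bx - n, by - n), (bx + n, by - n)]
    best = None
    for nx, ny in candidates:
        if nx < 0 or ny < 0 or nx + n > width or ny + n > height:
            continue
        sim = block_similarity(labels, bx, by, nx, ny, n)
        if best is None or sim > best[1]:
            best = ((nx, ny), sim)
    return best


def choose_search_origin(dbd_zero, dbd_adj, mv_adj):
    """Zero displacement if DBD_(0,0) is smaller than DBD_adj, otherwise the
    location pointed to by the MV of the most similar adjacent block."""
    return (0, 0) if dbd_zero < dbd_adj else mv_adj


# Toy label map with 2 x 2 blocks; the current block starts at (2, 2).
label_map = [
    [5, 5, 7, 7],
    [5, 5, 7, 7],
    [5, 5, 1, 1],
    [5, 5, 1, 7],
]
print(most_similar_neighbour(label_map, 2, 2, 2))
print(choose_search_origin(dbd_zero=12.0, dbd_adj=10.0, mv_adj=(3, -1)))
```

In the toy map only one of the four pixels of the current block shares a region with the upper block, so the upper block is marked with a similarity of 0.25 while the other neighbours score zero.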

In the presented method, the possibility of falling into a
local minimum can be reduced by using the motion direction
of the object. If the adjacent blocks contain the same
object as the current block, the block that contains the
largest part of the same object is selected and the new
search origin is set by its MV. By taking advantage of the
correlations of adjacent blocks, the search area can be
reduced, i.e., the matching candidate blocks with the opposite
motion direction are cut out.

2.2 Search Window Adjustment 

After setting the new search origin, the size of the search
window of each block is determined by considering the block
similarity and the MV of the adjacent block.
The size of the search window for each block is determined
at step 5 in Fig. 2. The maximum displacement of
the MV is determined in different ways according to
the new search origin chosen at step 4. When the new search origin
is set as the position pointed to by the MV of the adjacent
block (DBD_adj < DBD_(0,0)), the displacement is adjusted
by considering only the block similarity. In the DASW, we
use the block similarity to identify how well the new search
origin is predicted. Three classes of block, poor-, medium-,
and well-predicted, are defined as follows:

$$\text{The block is}\;
\begin{cases}
\text{poor-predicted}, & \text{if block similarity} \le T_{\mathrm{low}},\\
\text{medium-predicted}, & \text{if } T_{\mathrm{low}} < \text{block similarity} \le T_{\mathrm{high}},\\
\text{well-predicted}, & \text{if block similarity} > T_{\mathrm{high}},
\end{cases} \tag{1}$$

where T_low and T_high are thresholds for classifying the block.
The maximum displacements are set as w for the poor-predicted,
w/2 for the medium-predicted, and w/4 for the
well-predicted block, where w is an initial displacement of
the MV. For the well-predicted block, the current block is very
similar to the adjacent block, and their motion displacements
are also very similar. The best-matched block, therefore, is
located near the new search origin. This can be used to
reduce the size of the search window. In the poor-predicted
case, the opposite holds.
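
The classification of Eq. (1) and the resulting displacement can be written as a small Python sketch. The default thresholds of 30% and 70% are the values reported in Sec. 2.3; the function names and the use of integer division for w/2 and w/4 are assumptions made for illustration.

```python
T_LOW = 0.30   # value reported in Sec. 2.3
T_HIGH = 0.70  # value reported in Sec. 2.3


def classify_prediction(similarity, t_low=T_LOW, t_high=T_HIGH):
    """Classify a block as poor-, medium-, or well-predicted (Eq. 1)."""
    if similarity <= t_low:
        return "poor"
    if similarity <= t_high:
        return "medium"
    return "well"


def displacement_adjacent_origin(similarity, w, t_low=T_LOW, t_high=T_HIGH):
    """Maximum displacement when the search origin is the adjacent block's MV:
    w for poor-, w/2 for medium-, and w/4 for well-predicted blocks."""
    scale = {"poor": 1, "medium": 2, "well": 4}
    return w // scale[classify_prediction(similarity, t_low, t_high)]


for s in (0.10, 0.50, 0.90):
    print(s, classify_prediction(s), displacement_adjacent_origin(s, w=16))
```

With w = 16, the three classes search within ±16, ±8, and ±4 pixels of the new search origin, respectively.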

Second, if the new search origin is set by the zero displaced
block (DBD_(0,0) ≤ DBD_adj), the displacement is determined
as follows:

$$\text{displacement} = (\text{block similarity}) \times \max\{x, y \mid \text{MV component of the adjacent block}\} + w \times (1 - \text{block similarity}). \tag{2}$$

In this case, the motion vector lies between 0 and (x, y), where
(x, y) is the MV of the adjacent block that has the largest
block similarity. DBD_(0,0) ≤ DBD_adj means that the best
matched candidate block may exist in the center-biased
location.
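
A worked instance of Eq. (2) may help. The sketch below assumes that the MV components enter through their absolute values and leaves the result unrounded, since the text specifies neither; `displacement_zero_origin` is a hypothetical name.

```python
def displacement_zero_origin(similarity, mv_adj, w):
    """Eq. (2): blend the largest MV component of the adjacent block with the
    initial displacement w, weighted by the block similarity.
    Using absolute values of the components is an assumption."""
    largest = max(abs(mv_adj[0]), abs(mv_adj[1]))
    return similarity * largest + w * (1.0 - similarity)


# Similarity 0.75, adjacent MV (3, -2), initial displacement 16:
# 0.75 * 3 + 16 * 0.25 = 6.25
print(displacement_zero_origin(0.75, (3, -2), 16))
```

At the extremes, a similarity of 0 keeps the full displacement w, and a similarity of 1 shrinks the window to the largest component of the adjacent block's MV.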

For each block, the size of the search window is determined,
and then the BMA is performed in the conventional
way. This strategy can also be applied to some fast search
algorithms such as TSS, NTSS, 2D-LOG, etc.

2.3 Update the Thresholds 

In the DASW algorithm, three thresholds are used: T_low and T_high for
classifying the block in the search window adjustment, and
T_seg for image segmentation. Each threshold is
closely related to the computational complexity and the
performance of the proposed algorithm. The threshold T_low
is used to classify the new search origin of the block as
poor-predicted or medium-predicted, and the threshold T_high
is used to classify the new search origin as medium-predicted
or well-predicted. In our algorithm, we fixed
T_low and T_high to 30% and 70%, respectively. The thresholds
T_low and T_high are derived from the relation between
the block similarity and the differences of the MV components
of the two blocks. With some test sequences, we found
that blocks with high block similarity have very
similar MVs, and we estimated the thresholds
T_high and T_low by considering block similarities and
correlations of MVs. The threshold T_seg for image segmentation
is dynamically changed by considering the computational
complexity, that is, the number of matching candidate
blocks. If the computational complexity is higher than
the expected number of matching blocks, the threshold is
increased in proportion to the rate of increase in computational
complexity as



$$T_{\mathrm{seg}}^{\,t+1} = T_{\mathrm{seg}}^{\,t} + M \times \frac{(\text{Number of matching blocks}) - (\text{Number of the expected matching blocks})}{(\text{Number of the expected matching blocks})}, \tag{3}$$

where M is the mean of the given frame and the superscript denotes
the frame index. The initial threshold T_seg is set to half of M.
With this updating policy, we can obtain MVs with approximately
the expected computational complexity.
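
The update policy can be sketched in Python. Because Eq. (3) is damaged in the available copy, the sketch assumes an additive reading: the previous threshold plus the frame mean M times the relative excess of matching blocks. The name `update_t_seg` is hypothetical.

```python
def update_t_seg(t_seg, frame_mean, n_matched, n_expected):
    """Frame-by-frame update of the segmentation threshold, assuming an
    additive form of Eq. (3):
    new T_seg = T_seg + M * (n_matched - n_expected) / n_expected."""
    return t_seg + frame_mean * (n_matched - n_expected) / n_expected


frame_mean = 120.0
t_seg = frame_mean / 2          # initial threshold: half of the frame mean
t_seg = update_t_seg(t_seg, frame_mean, n_matched=500, n_expected=400)
print(t_seg)                    # 60 + 120 * 0.25 = 90.0
```

Under this reading the same expression also lowers the threshold when fewer blocks are matched than expected, although the text only describes the increasing case.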

3 Experimental Results 

In the experiments, three video sequences (176
pels × 144 lines and 30 frames/s) are tested. Two of the
sequences, Miss America and Susie, contain a speaker with
slow movements, which is typical of video phone or video
conferencing. The sequence Carphone has a moderate motion
field in an automobile. The mean absolute error (MAE)
distortion function is used as the matching criterion for BMA.
The motion vector search is based on the luminance component
with a search block of 16 by 16 and a displacement
of 16. The MSE per pixel of the prediction errors is taken as
the measure of performance. The number of search points,
which is computed by counting the matching blocks for
each block, is used to compare the computational complexity
of each method. Blocks on the image boundary have a
restricted search window in one or two directions.
For example, a block located at (0,0) has a search
window ranging from (0,0) to (N + w, N + w), not from
(−w, −w) to (N + w, N + w). With the three test sequences,
we have compared the performance of the search window
adjustment methods: a conventional method (conv.) where
Table 1 Average MSE per pixel and the number of search points
(NSP) per block for each method applied to FSBMA with maximum
motion displacement w = 16.

               Miss America       Susie            Carphone
BMA method     MSE      NSP       MSE      NSP     MSE      NSP
FSBMA-conv.    6.96     886       44.62    886     66.10    886
FSBMA-AASW     7.29     415       48.15    483     70.06    496
FSBMA-DASW     7.02     376       44.76    462     67.24    498



Table 2 Average MSE per pixel and the number of search points
(NSP) per block for each method applied to TSS with maximum
motion displacement w = 16.

               Miss America       Susie            Carphone
BMA method     MSE      NSP       MSE      NSP     MSE      NSP
TSS-conv.      16.69    28        65.29    29      92.54    29
TSS-AASW       13.17    32        58.52    32      87.61    32
TSS-DASW       11.22    18        56.25    19      85.19    20



Table 3 Average MSE per pixel and the number of search points
(NSP) per block for each method applied to NTSS with maximum
motion displacement w = 16.

               Miss America       Susie            Carphone
BMA method     MSE      NSP       MSE      NSP     MSE      NSP
NTSS-conv.     11.42    17        57.76    19      76.76    18
NTSS-AASW      11.56    21        58.22    21      76.73    20
NTSS-DASW      9.42     17        53.87    18      71.92    18



the size of the search window is set by the initial displacement,
the AASW scheme proposed by Feng et al.,¹³ and the proposed
DASW scheme, each applied to FS, TSS, and NTSS.
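
The restricted search window of boundary blocks explains why the conventional full search in Table 1 averages 886 search points per block rather than (2w + 1)² = 1089. The following Python sketch reproduces that figure under the assumption that displacements are integers in [−w, w] and that candidates must lie entirely inside the 176 × 144 frame; the helper names are hypothetical.

```python
def axis_positions(origin, block, frame, w):
    """Number of valid displacements along one axis for a block starting at
    `origin`, when the candidate must stay inside a frame of size `frame`."""
    lowest = max(-w, -origin)
    highest = min(w, frame - block - origin)
    return highest - lowest + 1


def average_full_search_points(width, height, n, w):
    """Average number of full-search candidates per block over the frame."""
    total, blocks = 0, 0
    for by in range(0, height - n + 1, n):
        for bx in range(0, width - n + 1, n):
            total += axis_positions(bx, n, width, w) * axis_positions(by, n, height, w)
            blocks += 1
    return total / blocks


print(round(average_full_search_points(176, 144, n=16, w=16)))  # 886
```

Blocks on the frame border have 17 instead of 33 valid displacements along the clipped axis, which lowers the average over the 99 blocks of a frame.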

The average performances are presented in Tables 1, 2,
and 3 for the three test sequences. In FSBMA with the
three search window adjustment methods, the proposed
DASW scheme has about 50% fewer search points than
conventional FS without a great loss of MSE performance.
Also, compared with the AASW scheme proposed by Feng
et al.,¹³ the DASW approach has better MSE performance
with a similar number of search points. Table 2 shows the
average performances for TSS. From the table, the efficiency
of the DASW is clearer than that of the other two
schemes in both the number of search points and the MSE
performance. The number of search points is reduced by 30%
to 40%, and the MSE per pixel is better by about 3% to
10%. With the Susie sequence, the performance gain is
noticeable. In NTSS, the performance gain is smaller than that
of TSS because NTSS was developed for the
center-biased motion vector distribution in smoothly varying
image sequences. The number of search points of the DASW
method is similar to that of the original NTSS, but the
DASW shows better MSE performance with the three
video sequences. Both the MSE and the
number of search points are better than those of AASW.
Across the three video sequences, the performance gain is
greater in a video sequence with moderate motion fields
than in one with a nearly stationary motion field, such as the
Miss America sequence.
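
The search point reductions quoted above can be recomputed from the NSP columns of Tables 1 to 3. The short Python sketch below does only that arithmetic; the dictionary layout and function names are illustrative and not part of the original experiments.

```python
# NSP per block from Tables 1-3, in the order Miss America, Susie, Carphone.
NSP = {
    "FS":   {"conv": (886, 886, 886), "AASW": (415, 483, 496), "DASW": (376, 462, 498)},
    "TSS":  {"conv": (28, 29, 29),    "AASW": (32, 32, 32),    "DASW": (18, 19, 20)},
    "NTSS": {"conv": (17, 19, 18),    "AASW": (21, 21, 20),    "DASW": (17, 18, 18)},
}
SEQUENCES = ("Miss America", "Susie", "Carphone")


def reduction_percent(baseline, value):
    """Relative reduction of `value` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - value) / baseline


def dasw_reductions(search):
    """Reduction in search points of DASW against the conventional method."""
    conv, dasw = NSP[search]["conv"], NSP[search]["DASW"]
    return {seq: reduction_percent(c, d) for seq, c, d in zip(SEQUENCES, conv, dasw)}


for search in NSP:
    print(search, {seq: round(v, 1) for seq, v in dasw_reductions(search).items()})
```

For TSS the three reductions fall between 30% and 40%, and for full search they range from roughly 44% to 58%, in line with the "about 50%" stated in the text.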

Detailed MSE performances per pixel of the two sequences
Susie and Carphone are shown in Figs. 3, 4, and 5.
In our scheme, an object with high motion is very well
predicted since the block similarity is calculated from the
object-based regions that are preprocessed for BMA. Also,
falling into a local minimum is avoided by using the
object-based motion structures.




Fig. 3 MSE performances of the three methods applied to FS with maximum displacement 16: (a) Susie and (b) Carphone.

Fig. 4 MSE performances of the three methods applied to TSS with maximum displacement
w = 16: (a) Susie and (b) Carphone.




As the simulation results have shown the low computa- 
tional complexity of the proposed method, it can be used 
for low complexity video coding applications such as video 
phone or video conferencing. 

4 Conclusion 

In this paper, we propose a dynamic search window adjustment
technique for BMA to reduce the computational complexity
of FSBMA and to overcome the problem of falling
into a local minimum in several fast search algorithms.
The proposed method uses a block similarity and a DBD to
adaptively adjust the size of the search window. The technique
can also be easily applied to FSBMA and several fast
search algorithms to improve efficiency. The experimental
results have shown that the MSE performance and the
reduction of the number of search points with the proposed
DASW scheme are better than those of the conventional method and
the AASW. The simulation results indicate that the DASW
scheme can be used for low complexity video codec applications
such as video telephony or video conferencing.